Efficient Processing of Ad-Hoc Top-k Aggregate Queries in OLAP
Abstract
In this paper, we develop a principled framework for efficient processing of ad-hoc top-k (ranking) aggregate queries in OLAP. Such queries provide decision makers with the k groups that have the highest aggregate values. Current RDBMSs lack essential support for top-k aggregate queries: they process such queries with a naive, overkill materialize-group-sort scheme, which can be prohibitively inefficient. Our new framework is based on two fundamental properties, the Group-Ranking and Tuple-Ranking Principles. The principles dictate group-ordering and tuple-ordering requirements that together guide the query processor toward optimal aggregate query processing. To realize these requirements, we propose a new execution model and address the challenges of implementing new query operators, enabling efficient top-k aggregate query plans that are both group-aware and rank-aware. The experimental study validates our framework by demonstrating orders-of-magnitude performance improvements of the new query plans over the traditional approach.
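To make the baseline concrete, the following is a minimal sketch of the materialize-group-sort scheme that the abstract describes as the traditional approach. It is not the paper's framework. It assumes SUM as the aggregate function, and the function name `naive_top_k_groups` and the sample `sales` data are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def naive_top_k_groups(rows, k):
    """Baseline sketch: aggregate every group, sort all groups, keep the top k.

    rows is an iterable of (group_key, value) pairs; SUM is an assumed aggregate.
    """
    totals = defaultdict(float)
    # Materialize and group: every tuple is read and every group is fully aggregated.
    for group, value in rows:
        totals[group] += value
    # Sort: all groups are ordered even though only k of them are returned.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical sample data: (product, sale amount).
sales = [
    ("tv", 120.0),
    ("phone", 300.0),
    ("tv", 80.0),
    ("laptop", 250.0),
    ("phone", 50.0),
    ("laptop", 10.0),
]
top2 = naive_top_k_groups(sales, 2)
print(top2)
```

The sketch does the full work for every group regardless of k, which is the inefficiency the abstract attributes to the traditional approach.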
Similar resources
A Dynamic Method for Answering Ad-Hoc Continuous Aggregate Queries
Data streams are infinite, fast, time-stamped data elements that are received in rapid bursts. Generally, these elements need to be processed online and in real time, so algorithms that process data streams and answer queries over them are mostly one-pass. Executing such algorithms raises challenges such as memory limitations, scheduling, and accuracy of answers. They will be more ...
Novel Techniques for Data Warehousing and Online Analytical Processing in Emerging Applications
A data warehouse is a collection of data that supports the decision-making process. Data cubes and online analytical processing (OLAP) have become very popular techniques for helping users analyze data in a warehouse. Although previous studies on data warehouses and data cubes have been proposed and developed, new applications keep emerging, and there are still technical challenges which have not been...
Evaluation of Ad Hoc OLAP: In-Place Computation
Large-scale data analysis and mining activities, such as identifying interesting trends, making unusual patterns stand out, and verifying hypotheses, require sophisticated information extraction queries. Being able to express these data mining queries concisely is of major importance not only from the user’s, but also from the system’s point of view. Recent research in OLAP has focused on dat...
An efficient, robust method for processing of partial top-k/bottom-k queries using the RD-Tree in OLAP
Online analytical processing (OLAP) is a widely used technology for facilitating decision support applications. In this paper, we consider partial aggregation queries, especially partial top-k/bottom-k queries, which retrieve the top/bottom-k records among the specified cells of the given query. For efficient processing of partial ranking queries, this paper proposes a set of algorithms using th...
Evaluation of Top-k OLAP Queries Using Aggregate R-Trees
A top-k OLAP query groups measures with respect to some abstraction level of interesting dimensions and selects the k groups with the highest aggregate value. An example of such a query is “find the 10 combinations of product-type and month with the largest sum of sales”. Such queries may also be applied in a spatial database context, where objects are augmented with some measures that must be ...
Publication date: 2005